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ABSTRACT 

Realistic  dynamic  systems  are  often  strongly  nonlinear,  particularly  those  for  the  ocean  and  atmosphere. 
Applying  variational  data  assimilation  to  these  systems  requires  a  tangent  linearization  of  the  nonlinear 
dynamics  about  a  background  state  for  the  cost  function  minimization.  The  tangent  linearization  may  be 
accurate  for  limited  time  scales.  Here  it  is  proposed  that  linearized  assimilation  systems  may  be  accurate  if 
the  assimilation  time  period  is  less  than  the  tangent  linear  accuracy  time  limit.  In  this  paper,  the  cycling 
representer  method  is  used  to  test  this  assumption  with  the  Lorenz  attractor.  The  outer  loops  usually 
required  to  accommodate  the  linear  assimilation  for  a  nonlinear  problem  may  be  dropped  beyond  the  early 
cycles  once  the  solution  (and  forecast  used  as  the  background  in  the  tangent  linearization)  is  sufficiently 
accurate.  The  combination  of  cycling  the  representer  method  and  limiting  the  number  of  outer  loops 
significantly  lowers  the  cost  of  the  overall  assimilation  problem.  In  addition,  this  study  shows  that  weak 
constraint  assimilation  corrects  tangent  linear  model  inaccuracies  and  allows  extension  of  the  limited 
assimilation  period.  Hence,  the  weak  constraint  outperforms  the  strong  constraint  method.  Assimilated 
solution  accuracy  at  the  first  cycle  end  is  computed  as  a  function  of  the  initial  condition  error,  model 
parameter  perturbation  magnitude,  and  outer  loops.  Results  indicate  that  at  least  five  outer  loops  are 
needed  to  achieve  solution  accuracy  in  the  first  cycle  for  the  selected  error  range.  In  addition,  this  study 
clearly  shows  that  one  outer  loop  in  the  first  cycle  does  not  preclude  accuracy  convergence  in  future  cycles. 


1.  Introduction 

The  representer  method  (Bennett  1992)  is  a  four- 
dimensional  variational  data  assimilation  (4DVAR)  al¬ 
gorithm  that  relies  on  the  adjoint  of  the  dynamical 
model  and  expresses  the  analyzed  solution  as  a  first- 
guess  plus  a  finite  linear  combination  of  representer 
functions,  one  per  datum.  The  explicit  computation  and 
storage  of  all  the  representer  functions  (direct  method), 
however,  is  not  required  since  the  method  can  be  imple¬ 
mented  indirectly  using  the  conjugate  gradient  method 
(CGM:  Amodei  1995;  Egbert  et  al.  1994).  A  description 
of  the  representer  methodology  is  provided  in  the  ap¬ 
pendix. 

The  representer  method  has  earned  an  established 
reputation  as  an  advanced  data  assimilation  technique 
within  the  past  decade,  and  has  gained  the  attention  of 
many  potential  operational  users.  Two  primary  issues. 
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however,  need  to  be  addressed  prior  to  implementing 
the  representer  method  operationally. 

The  first  issue  that  is  addressed  in  this  paper  is  the 
stability  and  validity  of  the  tangent  linear  model  (TLM) 
and  how  it  impacts  the  assimilation  accuracy.  When  the 
representer  method  is  applied  to  a  nonlinear  model,  the 
model  must  be  linearized,  preferably  using  the  first- 
order  approximation  of  Taylor's  expansion.  Tradition¬ 
ally.  the  representer  method  has  been  implemented  for 
the  assimilation  of  all  observations  in  the  time  window 
considered.  As  with  every  other  variational  data  assimi¬ 
lation  method  with  nonlinear  dynamics,  the  representer 
method  necessitates  that  the  TLM  and  its  adjoint  be 
valid  and/or  stable  over  the  entire  assimilation  time 
window.  The  validity  of  the  TLM  is  difficult  to  maintain 
over  a  long  time  period  for  strong  nonlinear  models  and 
complex  regions. 

The  second  issue  that  is  addressed  in  this  paper  is  the 
computational  cost  of  the  representer  method.  T  he  in¬ 
direct  representer  method  requires  the  integration  of 
the  adjoint  and  TLM  within  a  CGM  that  determines  the 
representer  coefficients  for  the  minimization  of  the  cost 
function  (see  the  appendix).  This  set  of  representer  co- 
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efficients  is  then  used  to  provide  it  correction  to  the 
background.  The  number  of  iterations  of  the  COM 
(this  is  referred  to  as  the  inner  loop)  is  typically  a  small 
fraction  of  the  total  number  of  measurements.  Outer 
loops  are  typically  required  for  strongly  nonlinear  sys¬ 
tems.  To  initialize  the  outer  loop,  one  would  pick  a  first 
background  solution  around  which  the  model  is  linear¬ 
ized.  The  best  solution  (corrected  background)  ob¬ 
tained  from  this  assimilation  would  become  the  back¬ 
ground  for  the  next  outer  loop,  and  so  forth  until  formal 
convergence  is  reached  (Bennett  et  al.  19%;  Ngodock 
el  al.  2000;  Bennett  2002:  Chua  and  Bennett  2001:  Muc- 
cino  and  Bennett  2002).  This  outer  loop  exacerbates 
the  computational  cost  of  the  representer  method.  In 
this  study  the  background  that  serves  for  linearization  is 
also  taken  as  the  first  guess. 

These  two  issues  have  detracted  many  potential  users 
of  the  representer  method  for  operational  purposes.  It 
is  possible  however,  to  address  these  issues  and  imple¬ 
ment  the  representer  method  at  a  reasonable  cost  for 
operational  applications.  Given  a  time  window  in  which 
one  desires  to  assimilate  observations,  it  is  possible  to 
apply  the  representer  method  over  cycles  of  subinter¬ 
vals.  The  name  adopted  for  this  approach  is  the  cycling 
representer  method  (Xu  and  Daley  2000).  and  its  asso¬ 
ciated  solution  is  called  the  cycling  solution.  The  solu¬ 
tion  that  is  obtained  by  assimilating  all  the  observations 
at  once  in  the  original  full  time  window  will  be  called 
the  global  solution. 

By  using  the  cycling  representer  method,  the  assimi¬ 
lation  time  window  is  constrained  to  a  period  over 
which  the  TLM  produces  an  accurate  dynamical  repre¬ 
sentation  of  the  Lorenz  attractor.  Doing  this  reduces 
the  need  for  outer  loops.  The  outer  loop  is  designed  to 
solve  the  nonlinear  Euler-Lagrange  conditions  associ¬ 
ated  to  the  assimilation  problem  with  the  nonlinear 
model,  because  the  representer  method  solves  a  linear 
assimilation  problem.  In  the  global  solution  problem, 
the  TLM  may  not  be  an  accurate  representation  of  the 
solution,  and  the  adjoint  would  not  be  an  accurate  es¬ 
timate  of  the  derivative  of  the  state  with  respect  to  the 
control  variables.  If  the  TLM  is  an  accurate  represen¬ 
tation  of  the  solution,  the  need  for  outer  loops  is  re¬ 
moved.  In  the  initial  cycles  of  this  assimilation  ap¬ 
proach.  the  first-guess  or  background  solution  may  not 
be  accurate  and  thus  outer  loops  may  be  required.  Once 
the  system  is  spun  up  and  the  TLM  is  an  accurate  ap¬ 
proximation  (thanks  to  improved  background  solu¬ 
tions).  outer  loops  may  no  longer  be  necessary,  thus 
lowering  the  computational  cost  of  the  assimilation. 
There  may  be  situations  in  real-world  applications, 
however,  where  a  few  outer  loops  would  be  needed  in 
the  current  cycle,  even  though  a  single  outer  loop  suf¬ 


ficed  in  previous  cycles.  An  example  of  this  occurrence 
could  be  a  nonlinear  ocean  response  (advection  or  mix¬ 
ing)  to  a  sudden,  strong  change  in  atmospheric  forcing, 
especially  in  coastal  areas  with  complex  bathymetry. 
Nevertheless,  the  need  of  additional  outer  loops  is  as¬ 
sessed  by  the  discrepancy  between  the  assimilated  so¬ 
lution  and  the  data. 

The  idea  of  cycling  the  representer  method  was  in¬ 
vestigated  by  Xu  and  Daley  (2000)  using  a  II)  linear 
transport  model  with  synthetic  data.  In  that  study,  the 
error  covariance  of  the  analyzed  solution  was  updated 
at  the  end  of  each  cycle  and  used  as  the  initial  error 
covariance  in  the  next  cycle.  Another  application  of  the 
cycling  representer  method  was  performed  by  Xu  and 
Daley  (2002)  using  a  2D  linear  unstable  barotropic 
model  with  no  dynamical  errors.  In  the  2002  study,  the 
covariance  at  the  end  of  the  cycle  is  not  updated  be¬ 
cause  its  computation  is  too  costly  to  be  practical.  Up¬ 
dating  the  covariance  requires  the  evaluation  and  stor¬ 
age  of  the  representer  functions  at  the  final  time.  These 
two  studies  found  that  updating  the  covariance  at  the 
end  of  each  cycle  produced  significantly  more  accurate 
analyses.  However,  in  these  two  applications  of  the  cy¬ 
cling  representer  method,  only  linear  models  were  used 
and  thus  there  were  no  need  for  a  TLM.  Most  realistic 
applications  are  nonlinear  and  their  TLM  may  not  be 
stable  over  the  time  window  considered.  It  is  in  this 
context  that  this  study  applies  the  cycling  representer 
method. 

A  good  dynamical  system  candidate  for  testing  as¬ 
similation  methods  for  strongly  nonlinear  models  is  the 
acclaimed  Lorenz,  attractor  model  (Lorenz.  1963:  big. 
I).  It  has  been  used  in  numerous  to  studies  to  examine 
the  behavior  of  assimilation  methods  based  on  sequen¬ 
tial  filtering  and  variational  techniques.  This  is  done 
with  the  intent  that  if  an  algorithm  performs  satisfac¬ 
torily  well  with  this  model,  then  it  can  be  applied  to 
atmospheric  and  ocean  models,  which  is  a  necessary  but 
not  a  sufficient  condition. 

Gauthier  (1992)  implemented  the  strong  constraint 
variational  assimilation  method  using  the  adjoint  of  the 
Lorenz,  equations  and  a  rather  sparse  data  network 
sampled  every  two  time  units.  He  found  that  the  cost 
function  was  well  behaved  when  the  model  did  not  un¬ 
dergo  a  transition.  However,  the  solution  was  very  sen¬ 
sitive  to  initial  perturbations.  This  problem  is  closely 
related  to  the  stability  and  validity  of  the  TLM  approxi¬ 
mation  during  the  different  phases  of  the  model. 

Miller  et  al.  ( 1994)  also  implemented  the  strong  con¬ 
straint  variational  assimilation  method  with  the  Lorenz 
attractor  model  and  found  that  the  behavior  of  the  cost 
function  encountered  by  Gauthier  (1992)  was  strongly 
dependent  on  the  length  of  the  assimilation  window. 
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Fui.  I .  The  Lorenz  attractor  solution  computed  fur  20  time  units 
usm £  the  RK4  time-stepping  scheme  with  a  time  step  of  dt 
t/h(K).  This  is  the  true  solution  from  which  observations  are 
sampled  every  %  time  unit. 

However,  contrary  to  Gauthier  (1992),  Miller  et  al. 
(1994)  did  not  derive  the  adjoint  equations  for  the  mini¬ 
mization  of  the  cost  function.  They  computed  the  gra¬ 
dient  of  the  cost  function  directly  and  invoked  a  con¬ 
jugate  gradient  routine  to  minimize  this  cost  function. 
The  assimilated  solution  computed  by  Miller  et  al. 
(1994).  fitted  the  data  (sampled  every  11.25  time  units) 
for  7  time  units.  The  forecast  following  the  assimilation 
was  able  to  track  the  data  up  to  three  additional  time 
units  before  errors  increased  significantly. 

Evensen  and  Fario  (1997)  implemented  the  weak 
constraint  variational  assimilation  method  using  the 
Lorenz  attractor  equations  with  the  same  configuration 
as  Miller  el  al.  (1994).  By  including  a  dynamic  error  and 
directly  computing  the  gradient  of  the  cost  function 
with  respect  to  all  state  variables  (in  space  and  time), 
they  were  able  to  use  a  gradient  descent  algorithm  to 
accurately  assimilate  data  (sampled  every  0.25  time 
unit)  for  longer  time  periods  (20  and  60  time  units)  with 
no  unexpected  behavior  of  the  cost  function.  However, 
when  the  data  density  decreased,  the  same  method  pro¬ 
duced  only  a  local  minimum  of  the  cost  function  that 
resulted  in  three  missed  transitions  within  an  interval  of 
20  lime  units.  This  suggests  that  the  good  fit  that  Miller 
et  al.  ( 1994)  obtained  for  seven  time  units  in  the  strong 
constraint  approach  depends  on  the  rather  dense  data 
network.  Hole  that  in  Evensen  and  Fario  (1997).  there 
is  no  need  of  the  TLM  or  the  adjoint. 

Realistic  applications,  however,  will  not  always  have 
the  luxury  of  an  exact  hand-computed  gradient  of  the 
cost  function.  The  computation  of  the  gradient  typically 
relies  on  the  adjoint,  which  in  turn  relies  on  the  TLM. 


Because  of  the  chaotic  behavior  of  realistic  models,  any 
linearization  is  subject  to  error  growth  therefore  limit¬ 
ing  the  application  of  the  TLM  approximation  to  a  time 
scale  for  which  errors  are  lower  than  a  prescribed 
threshold.  Our  approach  is  different  from  the  studies 
mentioned  above  in  Lhat  we  compute  the  gradient  of 
the  cost  function  by  means  of  the  adjoint  and  imple¬ 
ment  the  weak  constraint  variational  assimilation  using 
the  cycling  rcpresenier  method.  In  this  paper,  the  glob¬ 
al  and  cycling  solutions  are  compared  and  the  ability  to 
remove  the  outer  loops  is  examined.  In  these  experi¬ 
ments  a  significance  test  is  not  performed.  This  would 
turn  the  assimilation  problem  into  a  search  for  suitable 
prior  assumptions  about  the  data,  initial  condition,  and 
dynamical  errors,  and  hence  cloud  the  issue  at  hand. 
This  study  specifies  prior  error  covariances  similar  to 
those  used  in  previous  studies.  The  Lorenz  model  and 
the  variational  assimilation  problem  are  described  in 
section  2,  and  the  TLM  stability  and  accuracy  are  dis¬ 
cussed  in  section  3.  Section  4  deals  with  the  assimilation 
experiments.  A  discussion  and  concluding  remarks  fol¬ 
low  in  sections  5  and  6.  respectively. 

2.  The  Lorenz  model  and  the  variational 

assimilation  problem 

the  Lorenz  model  is  a  coupled  system  of  three  non¬ 
linear  ordinary  differential  equations: 


—  =  px  -  y  -  .vc  +  q'\  and 

‘j;=xy  ~  Pz  +  q\  U) 

where  jc,  \\  and  z  are  the  dependent  variables.  The 
commonly  used  time  invariant  coefficients  arc  a  =  28, 
p  ~  10,  and  (i  =  8/3  and  the  model  errors  are  repre¬ 
sented  by  q\  q'\  and  cf .  The  initial  conditions  for  Bq. 
( 1 )  are 

.v(0)  =  x0  +  i\ 

v(0)  -  v0  +  i}\  and 

z(0)  =  z<>  +  (2) 

where  =  1 .508  87.  y(1  =  - 1 ,53 1  271,  and  =  25.460  9  i 
are  the  first  guess  of  the  initial  conditions.  These  are 
the  same  values  that  are  used  in  the  data  assimilation 
studies  by  Miller  et  al.  ( 1994),  Evensen  ( 1997),  Evensen 
and  Fario  (1997).  Miller  et  al.  (1999),  and  Evensen  and 
Van  Lee u wen  (2000).  The  initial  condition  errors  are 
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represented  by  i  \  i\  and  i\  By  selling  the  model  and 
initial  condition  errors  in  Eqs.  ( I )  and  (2)  to  zero,  the 
solution  to  the  Lorenz  attractor  is  computed  for  the 
time  interval  |t)*  20]  using  the  fourth-order  Runge- 
Kutta  (RK4)  discretization  scheme  with  a  time  step  of 
tlr  =  1/600  (Fig.  1).  This  solution  is  labeled  as  the  true 
solution*  since  using  time  steps  smaller  than  dt  =  1/600 
does  not  significantly  alter  the  solution  within  the  speci¬ 
fied  time  period.  Thus*  the  numerical  representation 
has  converged* 

In  the  lime  interval  |(),  20]  there  is  a  set  of  M  obser¬ 
vations  cl  e  9tA/  such  that 

cl  =  z)  +  f*  (3) 

where  H  is  a  linear  measurement  functional  (A#  x  3 
matrix)  and  f  €  Xf  is  the  vector  of  measurement  er¬ 
rors*  The  data  used  for  all  assimilation  experiments  are 
sampled  from  the  true  solution  (Fig*  l )  with  a  frequency 
of  0.25  time  units*  Since  there  are  three  position  vari¬ 
ables*  there  are  a  total  of  M  =  237  measurements.  Pur¬ 
posefully*  data  is  not  sampled  at  T  =  0  or  T  -  20  in 
order  to  demonstrate  the  capability  of  the  assimilation 
method  to  propagate  the  information  from  future 
(past)  observations  to  correct  the  initial  (final)  condi¬ 
tions*  The  measurement  error  is  assumed  to  be  h  = 
0*002*  and  its  covariance  matrix  is  assumed  to  be  diago¬ 
nal. 

In  general,  the  errors  in  Eqs.  (1 ).  (2),  and  (3)  are  not 
known.  Therefore  solving  Eq.  (1 )  directly  is  an  ill-posed 
problem*  By  using  generalized  inversion*  a  solution  to 
Eq.  ( I )  is  obtained  that  minimizes  the  errors  in  Eqs* 
( 1 )— (3 )  in  a  weighted  least  squares  sense  using  a  cost 
function; 


/  r 


ii*i\y,  z)  ~  j  j  q(/, )Ww(f„  Mq(f2) rf/,  dt2  +  i'  W„i 


+  r  wr. 
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The  weights  (W,;r  W  r  and  w)  in  Eq.  (4)  arc  defined  as 
the  inverses  of  the  respective  error  covariance  matrices 
Clfir  Cir  and  C,  for  the  model*  initial  condition*  and 
measurement  errors*  respectively. 

For  the  sake  of  clarity*  Eqs*  ( I  )-(2)  arc  rewritten 
here  in  vector  notation: 

dx 

j,  -  fis| + •' 

x((l)  =  Xo  +  i*  (5) 

where  x  is  the  state  vector  consisting  of  the  three  inde¬ 
pendent  variables  (\\  y*  z)  at  each  time  step  and  F(x)  is 


the  model  dynamics.  The  observations  may  be  written 
in  a  similar  notation: 


<1  H\  +  f. 


(6) 


In  real-world  problems*  the  observation  operator  H 
may  be  nonlinear  for  some  types  of  measurements  (c.g., 
radiances  in  the  atmosphere*  acoustic  tomography  in 
the  ocean)  therefore  requiring  a  tangent  linear  approxi¬ 
mation  so  that  the  represen ler  method  may  be  used. 
The  Euler- Lagrange  (EL)  conditions  for  a  local  extre¬ 
mum  of  the  cost  function  are  derived  by  setting  the  first 
variation  of  Eq.  (4)  to  zero.  These  conditions  are 
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and 
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S(0)  =  x(,  +  C„A{0).  IS) 

In  the  above  equations*  x  is  the  optimal  solution  and  A 
is  the  weighted  residual  or  adjoint  variable  defined  as 
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The  initial  condition  error  that  is  used  to  perturb  Eq. 
(2)  is  specified  to  be  10%  of  the  standard  deviation  of 
each  state  variable  of  the  true  solution  (/'  =  0*784,  iy  = 
0*897,  and  r  —  0.870)*  The  initial  condition  error  co- 
variance  (C„)  is  simply  a  3  x  3  diagonal  matrix  with 
values  equal  to  the  square  of  the  RMS  of  these  initial 
condition  errors* 

The  model  errors  used  to  perturb  Eq,  { l )  have 
unique  values  for  each  dependent  variable  at  each  time 
step  of  the  model  integration.  The  statistics  of  these 
model  errors  are  estimated  by  first  computing  a  de¬ 
graded  solution  using  a  time  step  of  dr  1/60.  The 
difference  between  this  degraded  solution  and  the  truth 
is  examined  up  to  the  point  where  the  solutions  diverge 
significantly  (5.4  time  units).  The  RMS  of  the  differ¬ 
ences  between  the  true  and  degraded  solutions  prior  to 
this  time  is  divided  by  the  number  of  time  steps  during 
this  time  period  (5.4  time  units)  to  obtain  an  estimate 
for  the  standard  deviation  (SID)  of  the  model  error 
|STD(r/)  =  3.69  X  10  *].  By  examining  the  temporal 
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correlation  of  these  differences,  a  decorrelation  time 
scale  of  r  =  0,25  lime  units  is  estimated  for  the  model 
error.  Therefore,  the  model  errors  are  created  by  first 
generating  three  time  series  of  Gaussian  distributed 
random  numbers  (one  for  each  dependent  variable),  A 
temporal  convolution  of  these  random  errors  is  then 
conducted  with  the  prescribed  exponential  function 
exp{-[(r  -  /')/t]2|.  This  array  of  correlated  errors  is 
finally  scaled  to  have  the  estimated  STD  (3.69  x  10 
The  spatial  covariance  of  these  dynamic  errors  (i.e.,  the 
stationary  part  of  Cf/f/)  is 


136  x  10  5  5.99  X  10  7  - 1 .56  x  10  ^ 

5.99  X  10 


1.36  x  10  5  -2,07  X  10  * 


x  it) 


_  —  1 .56  x  10  6  —2,07  x  10  6  1.36 


3.  The  TLM 


The  dynamical  model  in  Eq.  (5)  is  tangent  linearized 
around  a  first-guess  trajectory  x/(f)  using  the  first-order 
approximation  of  Taylor's  expansion: 


tlx 

~di 


„  f  m*f)  , 

Fix  l  +  — t — Mx  -  xf  ) 
dx 


q 


x(0)  =  xfa  +  fix,,  +  i.  (II) 

In  the  remainder  of  this  paper,  x'  is  referred  to  as  the 
background  solution.  The  assimilation  problem  is  then 
solved  with  the  linearized  model  [Eq.  (11)]  using  the 
representer  method.  The  re  presen  ter  expansion  can  be 
written  for  the  linear  EL  system  as  follows  (also  see 
appendix): 


M 

X(f)  =  xV)  +  2  (  12) 

rn=  I 

By  choosing  xf  for  both  the  background  (for  lineariza¬ 
tion)  and  the  first  guess  (for  assimilation),  one  can  for¬ 
mulate  the  assimilation  problem  as  a  search  for  the 
optimal  correction  to  xf  (Uboldi  and  Kamachi  2000: 
Jacobs  and  Ngodock  2003),  that  is,  the  second  term  in 
the  right-hand  side  of  Eq.  (12).  The  solution  x  from  the 
assimilation  is  taken  as  the  new  trajectory  around  which 
the  model  is  linearized  (i.e..  x  replaces  xf  in  the  next 
iteration  of  the  outer  loop). 

There  are  three  problems  that  are  associated  with 
this  approach.  First,  the  convergence  of  the  linear  it¬ 
erations  (outer  loops)  depends  on  the  background,  and 
the  background  may  not  always  be  sufficiently  accu¬ 
rate,  Second,  the  TLM  around  the  background  may  not 
always  be  valid  and  accurate  over  the  entire  assimila¬ 
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Fig,  2.  Stability  of  ihe  TLM  relative  to  ihe  RMS  of  the  (at  initial 
condition  and  (h)  model  error.  For  each  RMS  error*  the  TLM  was 
computed  1000  Limes,  each  time  perturbed  by  a  new  set  of  ran¬ 
domly  generated  numbers  scaled  by  the  specified  RMS  error.  The 
solid  line  is  the  mean  of  these  realizations  and  the  dotted  lines  are 
±  l  STD.  The  gray  asterisk  depicts  the  errors  that  are  used  in  the 
experiments  presented  in  this  paper 

lion  window,  especially  when  we  arc  dealing  with 
strongly  nonlinear  dynamical  models.  This  is  because 
the  accuracy/validity  of  the  TLM  is  limited  by  the 
growth  of  errors.  Finally,  the  outer  loop  amplifies  the 
cost  of  the  assimilation  associated  with  a  linearized 
model,  and  renders  the  cost  of  the  representer  method 
expensive  for  operational  purposes. 

The  accuracy  of  the  TLM  depends  on  the  initial  con¬ 
dition  and  model  errors.  The  estimation  of  these  errors 
is  described  in  section  2.  For  completeness*  the  TLM 
accuracy  limit  for  a  range  of  error  levels  in  both  the 
initial  conditions  and  model  are  separately  examined. 
The  experiments  for  each  type  of  error  contain  HX) 
different  values  for  the  RMS  error  ranging  from  0  to  I 
(initial  condition  error)  and  0  to  0.01  (model  error).  For 
each  RMS  error,  the  TLM  was  computed  1(H)()  times 
(for  statistical  significance),  and  in  each  run  a  different 
random  realization  of  the  errors  was  used.  The  accu¬ 
racy  criterion  for  the  TLM  is  based  on  the  1000  member 
ensembles  and  is  chosen  as  the  time  w^hen  the  differ¬ 
ence  between  the  TLM  and  the  background  exceeds 
the  standard  deviation  of  the  background.  In  Fig.  2  ihe 
mean  and  ±  I  STD  of  the  1000-member  TLM  accuracy 
time  is  shown  along  with  the  accuracy  lime  obtained 
with  our  choice  of  errors  in  section  2  (gray  asterisks). 
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This  realization  (gray  asterisks)  is  the  only  one  used  for 
the  experiments  throughout  the  remainder  of  this  pa¬ 
per.  Assimilations  using  significantly  large  ensembles 
would  be  prohibitive  and  the  results  difficult  to  display. 

In  section  2,  the  RMS  of  the  initial  condition  error  is 
estimated  to  be  0*852.  Using  one  realization  of  this  ram 
dom  initial  condition  error,  the  TLM  accuracy  is  0,4 
time  unit,  which  is  less  than  the  mean  accuracy  time  of 
the  1000  member  ensemble.  This  rather  short  accuracy 
time  given  by  the  initial  condition  error  indicates  po¬ 
tential  difficulties  for  the  assimilation  in  the  first  cycle. 
In  the  previous  section,  the  RMS  of  the  model  error  is 
estimated  to  be  0.003  69  and  using  one  realization  of 
this  model  error,  the  TLM  accuracy  is  4.077  time  units, 
which  is  very  close  to  the  mean  of  the  1000  member 
ensemble.  This  implies  that  for  the  choice  of  our  one 
realization  of  initial  and  model  errors,  once  the  initial 
condition  errors  have  been  corrected  in  the  first  cycle, 
the  TLM  will  have  a  longer  time  of  validity  in  subse¬ 
quent  cycles,  allowing  for  a  possible  extension  of  the 
cycle  length. 

The  initial  and  model  errors  selected  for  this  paper 
are  applied  to  the  TLM  [Eq.  (II)],  where  the  back¬ 
ground  \f  is  the  nonlinear  solution  obtained  with  the 
initial  conditions  (,v0.  y(),  using  the  RK4  time  step¬ 
ping  with  dt  =  1/60,  The  lime  series  of  the  background 
and  TLM  trajectories  are  shown  in  Fig.  3,  The  TLM 
strays  from  the  background  in  less  than  half  a  time  unit. 
This  deviation  of  the  TLM  is  caused  by  the  magnitude 
of  errors  and  their  growth,  as  well  as  the  early  transition 
that  the  background  undergoes  around  t  =  0.5.  The 
TLM  follows  this  transition  with  a  slight  delay  and 
wrong  amplitude.  After  this  transition,  the  errors  con¬ 
tinue  to  grow  unbounded  and  the  TLM  solution  does 
not  recover. 


4.  The  experiments 

The  previous  section  demonstrates  how  the  TLM  ac- 
cu ra cy  is  1  i m i le d  in  l i m e  g i ve n  t h e  e x pc cted  initial  con¬ 
dition  and  model  errors.  For  the  choice  of  errors  in 
section  2.  the  TLM  is  accurate  for  only  0.4  time  units,  h 
is  clear  that  under  these  conditions,  we  cannot  expect  a 
satisfactory  assimilation  solution  for  the  entire  assimi¬ 
lation  window  [0,  20].  The  background  trajectory  used 
to  compute  the  global  solution  is  obtained  by  integrat¬ 
ing  the  nonlinear  model,  and  adding  the  initial  condi¬ 
tion  and  model  errors  described  in  section  2,  Note  that 
although  the  rule  of  the  background  for  linearization 
purposes  is  the  same  in  the  assimilation  problem  and  in 
the  TLM,  the  background  need  not  be  the  same  in  both. 
In  the  TLM,  the  background  and  the  perturbation  er- 
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Flo.  3.  Comparison  between  the  background  solution  (BG)  and 
the  TLM  linearized  about  this  background.  The  background  is  the 
Lorenz  attractor  computed  using  the  RK4  time-stepping  scheme 
with  a  lime  step  of  dt  =  1/hO.  and  l he  coefficients  and  initial 
conditions  stated  in  the  text.  The  TLM  initial  conditions  and  dy¬ 
namics  are  perturbed  from  the  background  by  the  initial  condition 
and  model  error  specified  in  the  text  and  represented  by  the  gray 
asterisks  in  Fig.  2.  The  result  shows  that  the  TLM  diverges  from 
the  background  solution  after  about  1 1.4  time  units. 


rurs  that  force  the  TLM  are  needed  for  testing  pur¬ 
poses,  Thus,  an  unperturbed  nonlinear  solution  was 
used  as  the  background.  For  the  assimilation  experi¬ 
ments  discussed  in  this  section,  the  background  also 
serves  as  the  field  to  be  corrected.  In  this  case,  the 
background  is  obtained  by  solving  the  nonlinear  model 
and  adding  the  selected  errors  as  perturbations  in  order 
to  simulate  a  solution  that  is  significantly  different  from 
the  data.  An  additional  experiment  will  introduce  a 
model  error  through  the  perturbation  of  the  dynamical 
model  parameters, 

a.  The  inaccuracy  of  the  global  solution 

Figure  4  compares  the  global  solution  to  the  back¬ 
ground  and  true  solutions  (note  that  the  data  are 
sampled  from  the  true  solution  every  0.25  time  units 
except  at  t  —  0  and  t  —  20),  The  global  solution  attempts 
to  correct  l he  perturbed  background  by  assimilating  all 
data  over  the  entire  lime  frame  |0,  20]  using  the  direct 
represemer  method  with  one  outer  loop.  Even  though 
the  lime  frame  of  assimilation  is  far  greater  than  the 
stability  of  the  TLM.  the  solution  is  able  to  track  the 
data  somewhat  for  about  seven  lime  units.  One  slum  Id 
notice  that  whenever  the  discrepancy  between  the 
background  and  the  data  increases,  the  global  solution 
follows  the  background  due  to  the  relatively  small  co- 
variance  of  the  model  error,  li  can  he  seen  from  Fig.  5 
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Fig,  4  The  assimilated  solution  (solid  black  line)  computed 
usin £  i he  direct  represent er  method  over  the  entire  time  period 
(20  time  units)  goes  unstable  after  about  7  time  units.  The  data  are 
sampled  from  the  true  solution  (gray  line)  every  Va  time  unit  and 
are  assimilated  in  an  attempt  to  correct  the  wrong  background 
(dotted  line).  The  background  is  computed  in  the  same  fashion  as 
in  Fig.  3,  except  instead  of  the  TL.M  being  perturbed,  the  back¬ 
ground  is  perturbed  by  the  specified  initial  condition  and  model 
error. 


that  the  global  solution  is  able  to  reduce  the  prior  mis¬ 
fits  significantly  (even  beyond  the  time  range  of  accu¬ 
racy  of  the  TLM)  before  losing  track  of  the  data*  Be¬ 
yond  seven  time  units,  the  misfit  between  assimilated 
solutions  and  data  grows  rapidly  and  can  be  attributed 
to  the  increasing  errors  in  the  TLM  approximation. 
One  can  therefore  conjecture  lhat  the  error  growth  in 
the  TLM  can  be  limited  by  reducing  the  length  of  the 
assimilation  window.  This  provides  the  incentive  for 
reducing  the  assimilation  window  and  using  the  cycling 
re  presenter  method. 

h.  The  advantage  of  the  cycling  re  presenter 

approach 

In  this  section,  the  cycling  representer  method  is 
implemented  for  the  same  assimilation  problem  by  sub¬ 
dividing  the  assimilation  window  into  cycles  of  equal 
length.  This  is  motivated  by  ihe  reduced  range  of  accu¬ 
racy  of  the  TLM  and  the  ability  of  the  assimilation 
method  to  adjust  to  the  data  beyond  lhai  range.  There 
are  three  clear  advantages  that  one  can  foresee  in  this 
approach:  (i)  a  shorter  assimilation  window  will  limit 
the  growth  of  errors  in  the  TLM,  (ii)  the  background 
for  the  next  cycle  w  ill  be  improved,  and  (iii)  the  overall 
computational  cost  is  reduced*  h  is  assumed  that  the 
assimilation  in  the  current  cycle  will  improve  the  esti¬ 
mate  of  the  state  at  the  final  time.  The  ensuing  forecast 


Fit*.  5.  RMS  misfits  between  the  data  and  the  background  (solid 
line)  and  assimilated  (dotted  line)  solutions  for  the  first  7  time 
units  in  Fig.  4.  This  plot  reveals  that  even  though  the  TLM  is  only 
reliable  for  about  0.4  time  units  (see  Fig.  3),  the  assimilated  so¬ 
lution  is  stable  for  about  seven  lime  units  and  is  correcting  the 
background  toward  the  data  during  Ibis  time  period* 


(the  solution  of  the  nonlinear  model  propagated  from 
the  final  state)  is  a  better  background  for  t  he  next  cycle 
than  the  corresponding  portion  of  the  background  used 
in  the  global  solution.  The  same  random  error  is  added 
Lo  the  dynamics  of  l he  nonlinear  forecast  of  each  cycle 
as  w^as  added  to  the  corresponding  portion  of  the  back¬ 
ground  used  in  the  global  assimilation.  Except  for  the 
first  cycle,  the  initial  condition  of  each  forecast  is  per¬ 
turbed  by  the  model  error  instead  of  the  initial  condi¬ 
tion  error.  The  chaotic  behavior  of  the  model  and  the 
random  error  will  cause  a  divergence  (that  grows  wriih 
the  cycle  length)  between  the  observations  and  the 
background.  All  experiments  that  follow  use  the  same 
model  error  specifications  as  used  in  the  global  assimi¬ 
lation*  and  unless  noted  otherwise,  ihe  direct  re  pre¬ 
senter  method  is  applied  with  a  single  outer  loop,  which 
is  exactly  what  was  done  for  the  global  solution. 

The  first  cycling  representer  experiment  uses  2  cycles 
of  10  time  units  each.  The  results  in  Fig.  6  show  that  the 
solution  is  drastically  improved  compared  to  the  global 
solution  in  Fig,  4*  The  first  thing  one  observes  is  the 
disappearance  of  the  large  values  of  ihe  global  solution 
in  the  Iasi  2/3  of  the  assimilation  window.  The  analyzed 
solution  in  both  cycles  clearly  benefits  from  a  shorter 
assimilation  window  limiting  error  growth  in  the  TLM, 
and  ihe  second  cycle  has  the  extra  advantage  of  using  a 
better  estimate  of  the  background.  The  misfits  to  the 
data  are  decreased,  although  they  still  exceed  the  ob¬ 
servation  standard  deviation  in  many  portions  of  the 
trajectory. 
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Fjg.  6.  Same  :is  in  Fig,  4  except  Rial  here  I  he  assimilation  is 
separated  into  two  cycles,  where  the  final  assimilated  solution  of 
the  first  cycle  (at  t  =  10  lime  units)  is  used  as  the  initial  conditions 
for  the  computation  of  the  background  tn  the  second  cycle.  The 
background  for  the  first  cycle  is  perturbed  with  both  initial  con¬ 
dition  and  model  error,  whereas  for  the  second  cycle  the  back¬ 
ground  is  only  perturbed  with  model  error  By  performing  two 
cycles  of  the  direct  reprcscnicr  method  instead  of  one  (Fig.  Ti.  the 
assimilated  solution  is  now  stable  for  the  entire  time  period. 


The  results  in  Fig.  7  show  the  RMS  error  between  the 
truth  and  the  assimilated  solution  with  respect  to  time 
for  cycle  lengths  of  L  2T  5.  and  10  lime  units.  It  is  shown 
that  the  RMS  error  increases  with  the  cycle  length.  This 
is  to  be  expected  since  longer  cycles  violate  the  TLM 
accuracy  criterion.  In  other  words,  the  steady  decrease 
of  RMS  error  with  respect  to  the  cycle  length  indicate 
that  as  the  latter  approaches  the  TLM  stability  time  for 
the  range  of  perturbations  given  by  the  adjoint  model, 
the  assimilation  algorithm  is  better  able  to  fit  the  data. 

t\  The  cycling  indirect  represented  approach 

In  realistic  atmospheric  and  ocean  models  it  is  not 
possible  to  compute  all  the  re  presen  ter  functions  and 
implement  the  direct  re  presenter  algorithm.  In  this  sec- 
lion  the  indirect  reprcscnicr  algorithm  is  implemented 
(A mode i  1995;  Egbert  et  al.  1994:  C'hua  and  Bennett 
2001:  also  see  the  appendix).  In  the  first  attempt  to 
apply  the  indirect  approach  a  cycle  length  of  one  time 
unit  is  used.  Unlike  the  direct  representer  approach 
where  an  exact  matrix  inversion  is  used  to  compute  the 
re  presen  ter  coefficients,  the  indirect  approach  uses  an 
iterative  solver  (i.e.,  ihe  CGM).  The  COM  does  nol 
require  the  entire  matrix  to  be  inverted:  rather  it  iter¬ 
ates  through  search  directions  in  data  space.  The  stop¬ 
ping  criterion  of  the  CGM  is  either  (i)  when  the  relative 
norm  of  l he  residual  is  less  than  10  \  or  (ii)  when  the 
number  of  iterations  equals  the  number  of  measure¬ 


ments  within  the  cycle.  For  well-conditioned  systems, 
the  first  stopping  criterion  is  typically  reached  lirsi 
therefore  requiring  less  computation  time  compared  to 
the  direct  method. 

The  first  remark  in  ihis  experiment  is  that  the  repre¬ 
senter  matrix  for  the  firsi  cycle  assimilation  is  ilj  con¬ 
ditioned  (condition  number  2.3  x  1(F)  due  to  the 
rather  high  variance  of  the  initial  condition  errors.  As  a 
consequence,  the  CGM  converges  very  slowly  and  does 
not  reach  the  first  stopping  criterion.  Since  proper  con¬ 
vergence  is  scarcely  reached  within  the  lirsi  cycle,  outer 
loops  are  required  in  order  to  achieve  an  acceptable 
solution.  In  this  experiment,  four  outer  loops  are  ap¬ 
plied  to  each  cycle  (Fig.  Rb),  This  assimilated  solution 
tracks  the  data  well  and  compares  fairly  well  with  the 
direct  approach  (Fig.  8a),  albeit  for  a  higher  computa¬ 
tional  cost.  Since  ihe  indirect  method  seems  lo  only 
have  difficulties  in  the  first  cycle,  an  experiment  is  per¬ 
formed  where  four  outer  loops  are  only  applied  in  the 
first  cycle,  and  a  single  outer  loop  is  applied  in  subse¬ 
quent  cycles  (Fig.  8c).  This  solution  is  less  expensive 
than  l be  previous  and  its  accuracy  and  cost  is  compa¬ 
rable  to  the  direct  method.  The  reason  lhat  this  ap¬ 
proach  performs  well  is  because  it  is  important  to  per¬ 
form  an  accurate  assimilation  in  the  first  cycle  in  order 
lo  obtain  a  good  fit  to  the  data  and  correct  the  back¬ 
ground  containing  the  initial  condition  error.  This  im¬ 
proved  state  estimate  provides  a  good  initial  condition 
and  thus  background  for  the  next  cycle  and  so  forth,  h 
should  be  emphasized  here  that  the  proposed  method 
does  not  require  or  rely  on  the  ability  lo  obtain  a  good 
analysis  in  the  first  cycle,  although  it  is  a  desirable  out- 
co  me.  The  additional  o  u  l  e  r  l  oops  in  t  h  e  f  i  rs  i  c  vc  1  e  we  re 
used  in  the  context  of  the  indirect  representer  ap¬ 
proach.  in  order  lo  overcome  the  slow  eon  verge  nee  of 
the  conjugate  gradient  method  due  to  ihe  poor  condi¬ 
tioning  of  the  representer  matrix.  If  the  lirsi  cycle  fails 
to  give  a  good  analysis,  it  still  will  minimize  to  some 
extent  the  discrepancy  between  ihe  background  and  the 
data  at  the  final  lime,  which  will  yield  a  belter  back¬ 
ground  for  the  next  cycle.  As  ibis  process  is  repeated, 
ihe  system  will  fit  the  data,  as  illustrated  in  Figs,  7  and 
8.  The  lesson  learned  here  is  that  in  cycling  Ihe  indirect 
representor  method,  only  the  first  few  cycles  may  ne¬ 
cessitate  ouler  loops. 

It  should  also  be  mentioned  that  when  the  length  of 
the  cycle  is  reduced  to  0,5  lime  units,  which  is  roughly 
the  stability  time  for  the  TLM  in  the  first  cycle ,  the 
CGM  converges  rather  well  and  there  is  no  need  for 
multiple  outer  loops.  Phis  result  is  shown  in  Fig.  9  and 
implies  that  the  indirect  method  works  satisfactorily 
well  without  ouler  loops  when  ihe  cycle  length  is  at 
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hit  -  1,  RMS  of  the  misfit  between  assimi lilted  Mnd  true  solutions  using  different  numbers  of  cycles:  (;i)  20.  (b ) 
III.  (c)  5,  and  (si)  2  cycles.  The  cycle  boundaries  are  depicted  by  dashed  lines.  By  increasing  the  number  of  cycles 
from  2  (Fig.  7d  is  the  misfit  of  Fig.  6)  to  20  (Fig.  7a).  a  significant  improvement  in  the  assimilated  solution  is 
achieved. 


about  the  TLM  accuracy  range.  OF  course,  this  conclu¬ 
sion  would  have  to  be  tested  with  realistic  models, 

d.  The  east 

One  major  reason  why  the  represenier  method  is  not 
widely  implemented  is  the  perceived  computational 
cost.  The  biggest  reduction  in  cost  is  achieved  by  lim¬ 
iting  ihe  outer  loops  to  one,  as  was  mentioned  above. 
Further  gains  in  computational  cost  are  obtained  by 
cycling  the  representer  method.  Assume  that  the  matrix 
inversion  in  the  direct  method  is  performed  with  a  cost 
of  0(Aflog  M)  for  computing  M  representer  coeffi¬ 
cients*  where  M  is  the  number  of  measurements.  The 
cycling  approach  total  cost  will  be  Ncy  x  G(Mcy log 
Mcy),  w  here  Nty  is  the  number  of  cycles  and  Mcy  is  the 
number  of  measurements  within  each  cycle  (assuming 
that  the  measurements  are  uniformly  distributed  in  the 
assimilation  interval).  Although  Ncy  X  Mcy  =  M,  log 
;V/LS  gets  exponentially  smaller  with  increasing  Ncy, 
thus,  decreasing  the  computational  cost  as  illustrated  in 
Table  1.  I  lowever,  there  is  a  drawback  to  reducing  the 
cycle  length.  The  data  influence  is  extended  beyond  ihe 
cycle  interval  only  through  an  improved  initial  condi¬ 
tion  for  the  next  cycle.  Future  data  contained  in  subse¬ 
quent  cycles  will  not  contribute  to  the  assimilation  in 
the  current  and  past  cycles.  One  should  keep  this  in 


mind,  as  well  as  the  time  decorrelation  scale  of  the 
model  errors,  in  choosing  the  appropriate  cycle  length. 

The  cost  reduction  in  the  indirect  method  is  less  ob¬ 
vious.  t  lowever,  for  well -conditioned  representer  ma¬ 
trices,  the  indirect  method  typically  converges  in  a 
number  of  iterations  that  is  a  fraction  of  the  total  num¬ 
ber  of  measurements.  That  is,  the  cost  of  the  indirect 
method  is  a  fraction  of  the  cost  of  the  direct  method. 
Therefore,  since  cycling  and  avoiding  the  outer  loops 
reduce  the  cost  of  the  representer  method  in  the  direct 
method,  they  will  further  reduce  the  cost  in  the  indirect 
method.  Given  the  difficulties  encountered  by  the  indi¬ 
rect  method  in  the  first  cycle  of  one  time  unit,  longer 
cycles  were  not  implemented.  The  computational  cost 
of  the  20-cycle  experiment  with  the  conjugate  gradient 
using  4  outer  loops  in  the  first  cycle  and  l  outer  loop  in 
the  subsequent  cycles  is  1.27  s,  just  slightly  higher  than 
the  cost  of  the  direct  method,  the  excess  cost  being 
attributed  to  the  outer  loops  in  the  first  cycle. 

e.  Strong  constraint  cycling  versus  weak  constraint 
cycling 

The  assimilation  problem  can  be  solved  using  the 
strong  constraint  re  presen  ter  approach.  Here  the 
strong  and  weak  constraint  solutions  are  compared 
within  ihe  same  setting.  From  the  known  sensitivity  of 
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1-up  S.  RMS  of  the  misfit  between  assimilated  and  true  solutions 
using  20  cycles.  Assimilated  solutions  arc  computed  using  (a)  the 
direct  represen  ter  method  (same  as  in  I  ig.  7a),  (b)  the  indirect 
conjugate  gradient  method  using  lour  outer  loops  for  each  cycle, 
and  (c)  the  indirect  conjugate  gradient  method  using  four  outer 
loops  only  for  the  first  cycle.  Also  displayed  in  this  figure  arc  the 
times  required  for  each  compulation.  The  indirect  approach  (Fig. 
.Sc)  is  a  more  applicable  method  and  has  about  the  same  cost  as 
the  direct  method  (Fig.  Sa).  the  indirect  approach,  however,  takes 
about  twice  as  long  for  the  assimilated  solution  to  reach  an  accu¬ 
rate  level. 


the  Lorenz  model  to  initial  conditions,  the  question  one 
tries  to  address  by  the  strong  constraint  approach  is 
whether  it  is  possible  to  accurately  fit  the  data  within 
the  cycle  by  correcting/com  rolling  only  the  initial  con¬ 
dition,  How  short  does  a  cycle  need  to  be  to  achieve  an 
accurate  solution?  The  strong  constraint  solution  is  ob¬ 
tained  by  the  same  procedure  as  the  weak,  except  that 
the  model  error  covariance  is  set  to  zero.  In  comparison 
to  Fig.  7,  results  in  Fig,  10  reveal  that  the  weak  con¬ 
straint  solution  is  not  only  more  accurate,  but  can  also 
afford  longer  cycles  than  the  strong  constraint.  The 


Piii.  9,  RMS  of  the  misfit  between  assimilated  and  true  solutions 
using 40  equal-length  cycles.  The  assimilated  solution  is  computed 
using  the  indirect  conjugate  gradient  method  with  a  single  outer 
loop  applied  to  all  cycles. 


Tabi.F  I,  Computational  cost  o(  the  global  and  cycling  solutions 
using  the  direct  method  with  a  single  outer  loop. 


2  4  5  Itl  20 

Global  cycles  cycles  cycles  cycles  cycles 

Time  (s)  21.37  9.33  4,49  32>9  I.9U  1.09 


strong  constraint  is  almost  confined  to  the  11  M  validity 
lime,  and  needs  quite  a  lew  cycles  to  star!  matching  the 
data.  In  the  experiment  with  cycles  of  two  time  units, 
the  weak  constraint  accurately  fits  the  data  after  three 
cycles,  bul  the  strong  constraint  never  does.  When  the 
cycle  length  is  decreased  to  one  time  unit,  the  weak 
constraint  fits  the  data  in  the  second  cycle  and  after¬ 
ward,  while  the  strong  constraint  starts  lilting  the  data 
only  in  the  15th  cycle.  The  computations  here  show  the 
superior  performance  of  the  weak  const  rain!  approach. 
The  weak  constraint  approach,  however,  comes  with  a 
higher  computation  cost.  For  these  experiments,  the 
computational  cost  of  the  strong  constraint  is  about  half 
of  that  for  the  weak  constraint.  The  reason  for  i  his  large 
difference  in  cost  is  that  the  number  of  operations  to 
propagate  the  Lorenz  attractor  is  relatively  small  and  is 
about  equal  to  the  number  of  operations  required  to 
convolve  the  model  errors  in  space  and  time.  For  larger 
operational  models,  this  computational  cost  difference 
would  be  significantly  smaller.  Both  approaches  could 
perform  well  in  operational  settings  if  they  are  given 
sufficient  spin  up  cycles,  less  for  the  weak  constraint. 

/  Assimilating  only  the  x  variable 

It  is  unlikely  that  in  realistic  applications  all  compo¬ 
nents  of  a  dynamical  model  can  be  observed  or 
sampled.  For  example,  the  majority  of  measurements 
taken  in  the  ocean  are  from  the  surface.  I  n  mimic  i his 
scenario,  an  experiment  is  carried  out  where  only  the  \ 
component  of  the  Lorenz  model  is  sampled  at  the  same 
frequency  of  the  0.25  time  unit.  The  cycling  representer 
algorithm  with  cycles  of  one  time  unit  and  one  outer 
loop  is  used  to  assimilate  these  data.  Results  in  Fig.  1 1 
show  that  the  method  is  able  to  fit  the  data  satisfacto¬ 
rily,  albeit  for  a  longer  adjustment  or  spin  up  time.  Com¬ 
pared  to  Fig,  7a,  it  takes  five  additional  cycles  lor  the 
assimilation  to  start  matching  the  data.  Therefore,  a 
decrease  in  the  total  number  of  assimilated  observa¬ 
tions  and  the  fact  lhat  only  one  state  variable  is  sampled 
are  not  limiting  factors  for  the  proposed  algorithm, 

g.  First  cycle  accuracy  as  a  function  of  initial 
condition  error  variance  and  outer  loops 

As  mentioned  above,  the  proposed  algorithm  does 
not  solely  rely  on  the  ability  to  obtain  a  good  analysis  in 
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Fir*.  10.  Same  as  in  Fig.  7,  except  that  the  strong  constraint  represenicr  method  is  used  instead  of  the  weak 
constraint.  Comparison  of  this  figure  with  Fig.  7  shows  the  benefit  of  using  weak  constraint  to  assimilate  data  with 
strongly  nonlinear  dynamic  systems. 


the  first  cycle.  Since  the  Lorenz  model  is  extremely 
sensitive  to  initial  perturbations,  a  significantly  large 
error  was  prescribed  to  the  initial  condition  to  test  the 
robustness  of  the  algorithm-  It  was  shown  in  Figs.  8b, c 
that  comparable  accuracy  can  be  obtained  using  either 
outer  loops  in  every  cycle  or  just  in  the  first  cycle.  An 
experiment  is  presented  here  that  displays  the  accuracy 
of  the  assimilation  after  one  time  unit  as  a  function  of 
the  magnitude  of  the  initial  perturbations  and  the  num¬ 
ber  of  outer  loops.  The  results  in  Fig.  12  show  that  the 
accuracy  of  the  assimilated  solution  at  the  end  of  the 
first  cycle  decreases  as  the  RMS  of  the  initial  condition 
error  increases,  which  is  to  be  expected,  and  that  the  same 
accuracy  increases  with  the  number  of  outer  loops  used. 

For  initial  condition  RMS  errors  below  20%,  one 
outer  loop  is  sufficient  to  achieve  an  accurate  solution 
at  the  end  of  the  first  cycle  (i.e.,  the  data  misfit  is  less 
than  the  data  error).  When  the  initial  condition  RMS 
error  is  between  20%  and  40%,  two  outer  loops  are 
necessary.  Three  outer  loops  are  needed  when  the 
RMS  is  between  40%  and  45%,  and  beyond  45%  at 
least  five  outer  loops  are  needed.  The  initial  condition 
errors  selected  in  section  2  and  used  in  the  assimilation 
experiments  (Fig,  2a)  have  an  RMS  error  of  85%, 
which  according  to  Fig.  12,  would  require  at  least  five 
outer  loops  to  obtain  an  accurate  solution  at  the  end  of 
the  first  cycle.  However,  using  these  initial  condition 


errors  with  one  outer  loop  (Fig,  7a)  produced  an  accu¬ 
rate  assimilated  solution  after  two  cycles.  Therefore, 
one  must  determine  if  it  is  worth  the  computational  cost 
to  perform  additional  outer  loops  to  improve  the  accu¬ 
racy  of  the  first  cycle,  knowing  that  it  may  not  signifi¬ 
cantly  influence  the  accuracy  of  subsequent  cycles. 


I- Hi.  3 1.  RMS  of  the  misfit  between  assimilated  and  true  solu¬ 
tions  using  only  the  _v  observations  sampled  every  (1.25  time  unit 
and  the  direct  cycling  represenicr  algorithm  with  2(1  cycles  and 
one  outer  loop. 
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Fir..  12.  RMS  of  posterior  misfit  at  the  end  of  the  first  cycle  a s 
a  function  of  outer  loops  and  the  initial  condition  RMS  error. 


h.  Model  error  through  perturbed  parameters 

The  model  error  described  in  section  2  consists  of 
perturbations  of  the  right-hand  side  of  the  model  while 
keeping  the  same  dynamics  that  generated  the  obser¬ 
vations.  Another  possible  method  is  to  generate  the 
model  error  by  perturbing  the  parameters  in  the  dy¬ 
namical  model.  Contrary  to  the  previous  experiments, 
perturbing  the  parameters  in  the  model  ensures  that  the 
dynamics  used  in  the  assimilation  are  different  from  the 
dynamics  that  generated  the  data.  The  perturbation  of 
the  model  parameters  is  similar  to  that  of  the  initial 
conditions  described  in  section  2.  First  the  RMS  of  pa¬ 
rameters  is  computed,  and  the  perturbation  is  a  per¬ 
centage  of  that  RMS  multiplied  by  a  normally  distrib¬ 
uted  random  number.  For  each  selection  of  the  pertur¬ 
bation  magnitude,  1000  realizations  of  the  random 
number  are  used  in  separate  assimilation  experiments, 
generating  a  1000-mcmber  ensemble  of  assimilated  so¬ 
lutions.  In  Fig.  13,  the  mean  of  this  ensemble  of  poste¬ 
rior  misfits  at  the  end  of  the  first  cycle  is  shown  as  a 
function  of  the  perturbation  magnitude  and  the  outer 
loops.  As  the  parameter  perturbation  magnitude  varies 
from  0,1%  to  10%,  there  is  only  a  slight  increase  in  the 
posterior  misfits  at  the  end  of  the  first  cycle  in  the  first 
outer  loop.  Figure  13  indicates  that  the  misfits  decrease 
as  more  outer  loops  are  used,  with  the  biggest  decrease 
in  the  second  and  third  outer  loops.  However,  at  least 
five  outer  loops  are  needed  to  obtain  an  accurate  solu¬ 
tion  at  the  end  of  the  first  cycle.  After  the  first  outer 
loop  the  misfits  arc  fairly  constant  for  perturbations 
between  2%  and  10%  in  each  subsequent  outer  loop, 

5.  Discussion 

The  implications  of  the  proposed  algorithm  to  the 
real  world  of  oceanography  and  meteorology  are  im¬ 
mediate.  The  dimensionless  time  (t)  in  the  Lorenz 
model  is  related  to  a  simplified  one -layer  atmospheric 
model  time  (f)  by  r  =  ttH  I  +  u2)kl  where  a 2  =  0.5, 
//  is  the  depth  of  the  fluid,  and  k  is  the  conductivity. 


RMS  error  of  cwfliciurti  iperctnliflft  of  in#  vujupi 


Ftci.  13.  RMS  of  posterior  misfit  at  the  end  of  the  lirsl  cycle  as 
a  function  of  the  parameter  error  tin  terms  of  percentage  of  true 
value)  and  outer  loops. 

Therefore,  for  a  fluid  depth  of  500  m  and  a  conductivity 
of  25  x  10  \  a  time  unit  in  the  Lorenz  model  corre¬ 
sponds  to  7.8 1 8  days.  The  TLM  stability  range  depends 
on  the  resolved  processes  and  scales,  the  model  resolu¬ 
tion,  and  of  course,  the  nonlinear  interactions  within 
the  model  dynamics.  For  example,  the  TLM  of  a  gen¬ 
eral  circulation  ocean  model  forced  by  climatological  or 
monthly  means  atmospheric  fluxes  and  designed  to  re¬ 
solve  seasonal  to  interannual  variability  in  the  tropical 
Pacific  will  be  stable  for  months  (Ngodoek  el  al.  2000). 
whereas  the  TLM  of  a  fine  resolution  coastal  circula¬ 
tion  ocean  model  forced  with  synoptic  atmospheric 
fluxes  with  complex  bottom  topography  may  not  be 
stable  beyond  a  week  or  two.  In  the  latter  case  any 
hindcast  experiment  using  the  represenier  method  for 
time  ranges  longer  than  a  month  will  benefit  from  the 
algorithm  proposed  in  this  study.  Additionally,  the 
outer  loops  may  not  be  restricted  to  the  first  cycle  as  is 
the  case  here  with  the  Lorenz  model.  They  can  he 
turned  on  as  needed.  In  real-world  applications,  even 
though  the  assimilation  system  has  been  spun  up  over 
several  cycles  and  only  one  outer  loop  sufficed  for  the 
previous  cycle,  there  may  arise  a  situation  where  non- 
linearities  become  stronger  than  usual,  causing  the 
background  in  the  current  cycle  (i,e„  the  forecast  from 
the  previous  cycle)  to  deviate  significant  I  y  far  from  the 
data. 

The  proposed  algorithm  will  therefore  be  suitable 
both  for  reanalysis  computations  and  cycles  of  analysis 
and  predictions  (e,g.,  in  coastal  ocean  monitoring). 


The  impact  of  TLM  accuracy  in  a  cycling  represenier 
implementation  is  examined  using  the  Lorenz  at  tractor. 
Errors  in  the  initial  conditions  and  dynamics  of  the 
Lorenz  attractor  are  assumed  and  added  to  the  back¬ 
ground,  The  error  estimates  used  here  indicate  that  the 
TLM  about  this  background  is  accurate  up  lo  0.4  time 
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units.  Far  the  Lorenz  equations,  weak  constraint  as¬ 
similation  globally  over  a  20  time  unit  period  is  not 
possible  because  of  the  inaccuracy  and  instability  of  the 
TLM  over  such  a  long  lime  period*  Experiments  using 
successively  shorter  time  intervals  indicate  that  a  weak 
constraint  cycling  representer  assimilation  is  able  to 
provide  an  accurate  solution  with  cycle  intervals  of  two 
lime  units  or  less* 

For  the  problems  examined  here,  it  is  found  that  once 
a  sufficiently  accurate  initial  condition  is  provided  for  a 
cycle,  the  TLM  is  sufficiently  accurate  in  subsequent 
cycles  therefore  neglecting  the  need  for  outer  loops  to 
perform  the  nonlinear  minimization.  For  the  Lorenz 
attractor  system,  outer  loops  are  required  for  the  first 
cycle  and  not  for  subsequent  cycles. 

The  representer  method  is  used  for  the  4DVAR 
problem  with  the  Lorenz  model  in  various  settings;  the 
direct  approach  for  both  global  and  cycling  assimila¬ 
tions,  and  the  indirect  approach  for  the  cycling.  The 
first  guess  is  taken  as  the  nonlinear  background  solu¬ 
tion  that  serves  for  the  linearization  in  the  TLM,  The 
cycling  approach  is  motivated  by  the  limited  stability  of 
the  TLM  and  the  potential  operational  applications. 
For  the  selected  set  of  initial  and  model  errors,  the 
global  assimilation  solution  could  track  the  data  for 
about  seven  lime  units.  After  this  time,  the  solution 
rapidly  diverges  from  the  data  owing  to  unbounded  er¬ 
ror  growth  in  the  TLM*  The  cycling  solution  is  more  apt 
to  fit  the  data,  especially  with  small  cycle  lengths,  be¬ 
cause  the  error  growth  in  the  TLM  is  limited.  The  iter¬ 
ated  indirect  representer  method  was  then  imple¬ 
mented  via  the  CGM.  A  slow  convergence  of  the  CGM 
was  observed  in  the  first  cycle,  due  to  the  poor  condi¬ 
tioning  of  the  representer  matrix.  By  applying  four 
outer  loops  to  this  first  cycle,  a  sufficiently  accurate 
assimilation  was  achieved.  When  the  cycle  length  was 
reduced  to  0.5  time  units,  which  is  about  the  lime  range 
of  the  TLM  accuracy,  the  indirect  approach  converged 
rather  well  without  outer  loops.  In  addition,  these  weak 
constraint  assimilation  experiments  were  shown  to  sig¬ 
nificantly  outperform  their  strong  constraint  counter¬ 
parts. 

This  paper  shows  that  cycling  the  representer  method 
for  strongly  nonlinear  problems  is  a  valuable  assimila¬ 
tion  tool.  The  cycle  length  could  he  slightly  longer  than 
the  lime  range  of  accuracy  of  the  TLM.  The  indirect 
representer  algorithm  may  or  may  not  require  outer 
loops  in  the  early  cycles  depending  on  the  length  of  the 
cycle.  Once  the  assimilated  solution  becomes  suffi¬ 
ciently  accurate  in  the  estimation  of  the  final  state  for 
the  early  cycles,  the  need  of  outer  loops  is  removed. 
This  significantly  reduces  the  cost  of  the  weak  con¬ 


straint  variational  assimilation,  in  addition  to  the  reduc¬ 
tion  obtained  by  cycling. 

The  robustness  of  the  proposed  algorithm  is  demon¬ 
strated  by  its  ability  to  recover  the  true  solution  after 
few  spinup  cycles  in  a  series  of  separate  experiments: 
the  number  of  measurements  was  decreased  and  only 
the*  component  of  the  Lorenz  model  was  sampled,  the 
initial  condition  error  magnitude  was  varied,  and  finally 
the  model  error  was  introduced  by  perturbing  the  pa¬ 
rameters  in  the  dynamical  model.  In  these  settings  ad¬ 
ditional  outer  loops  were  necessary  to  achieve  an  accu¬ 
rate  solution  at  the  end  of  the  first  cycle.  I  lowever,  this 
study  clearly  show's  that  dropping  the  outer  loops  was 
not  detrimental  to  the  algorithm*  even  with  the  rather 
pessimistic  choice  of  initial  condition  and  model  errors. 
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APPENDIX 

Solving  the  Linear  EL  System  Using  the 
Representer  Method 

Given  a  background  field  x1*  the  linear  EL  to  be 
solved  is 


f/A 


(/Fix')]1 

dx 


A  -  H 1  wid  -  Hxt 


A(D  =  0 


(Ai) 


and 


_  f.  d¥ixf)  „ 

^  =  F(*0  +  ^<*-*')  +  (Va 


x(0)  =  x  +  C„A(0>. 


( A2) 


The  representer  expansion  for  uncoupling  Eqs.  (AI)- 
( A2)  is 


M 

i(f)  =  xf(t)  +  ^  a,„r,MUl  (A3) 

trt  1 

Here  the  background  (i.e.*  the  trajectory  around  which 
the  model  is  linearized)  is  also  taken  as  the  first  guess 
(the  solution  that  the  assimilation  will  correct).  The 
representer  functions  tm.  m  =  1.  . . ,  M  are  computed 
from 
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Am  =  o 


(/Fix') 

dx 


Am  -  H'5(/  -  /,rtJ 


( A4) 


and  l he  adjoint  of  the  Loren/,  model  is  given  by 

d\*  ,  r 

-  —  =  “trA1  +  pA1  -  rrXv  -  y'Ar 

rfA* 

—  —T—  —  jrA1  -  A1  T  a  A 

dt 


and 

dfm  d  F(V) 

ur =  <i\  +  Q,,‘< ' Km 

rJO)  =  C,-Am<0).  (A3) 

It  may  be  shown  (e.g.,  Bennett  2002)  that  the  repre¬ 
sentor  coefficients  atfr  tn  -  I , . .  *  M ,  in  Eq.  (A3)  are  the 
solution  of  the  linear  system: 

[FT  +  w  ']<*  =  d  -  Hx'.  ( A6) 

where  R'  is  the  re  presenter  matrix,  obtained  by  evalu¬ 
ating  the  representer  functions  at  the  measurements 
sites  {i.e.,  the  wth  column  of  R(  is  HrWJ).  In  practice, 
solving  Eq.  (A6)  does  not  require  the  computation  of 
the  entire  representor  matrix.  An  iterative  method  such 
as  the  conjugate  gradient  may  be  invoked*  as  long  as  the 
matrix-vector  product  on  the  left-hand  side  of  Eq.  ( A6) 
can  be  computed  for  any  vector  in  the  data  space.  This 
is  made  possible  through  the  indirect  representer  algo¬ 
rithm  (Amodei  1995;  Egbert  el  al.  1994),  which  is  also 
used  to  assemble  the  right-hand  side  of  Eq.  (A3)  with¬ 
out  the  explicit  computation  and  storage  of  the  repre¬ 
senter  functions.  Specifically,  given  a  vector  y  in  the 
data  space,  the  product  R  y  is  obtained  by  solving  Eqs. 
( A4)-(  A5)  with  y  replacing  the  Dirac  della  in  the  right- 
hand  side  of  Eq.  (A4),  then  applying  the  observation 
operator  H  to  the  resulting  r  Once  the  representer  co¬ 
efficients  a  are  obtained,  the  optimal  residuals  are  com¬ 
puted  by  solving  Eq.  ( A4),  where  the  single  Dirac  delta 
function  is  now  replaced  by  the  linear  combination 

'  U)«  These  residuals  are  then  used  in  the 
right-hand  side  of  Eq,  (A5)  to  compute  the  optimal 
correction  to  the  first  guess  xL 

Given  a  background  trajectory  .v'(0,  y;UL  and  z7(/), 
[Eq,  (l)]  the  system  is  tangent  linearized  as 

dx 

(tv  .  ,  , 

—  =  p-v  -  v  -  x'z  -  (-v  -  x1  k 

—■  =  x ry  -  (.v  -  -v r)yf  -  fiz.  (AT) 


-  —  =  -xf\y  -  0A\  ( A8) 

at 

REFERENCES 

Amodei,  L.,  1995:  Solution  approchcc  pour  un  problems 
d’assimilalion  dc  donrtfies  avee  prise  en  conipte  tie  Perrem  du 
modcle.  Comp.  Rend.  Acad.  Set .,  32 J  (Hah  1087  1094, 

Bennett,  A,  F.,  1992:  Inverse  Methods  in  Physical  Oceanography. 
Cambridge  University  Press,  347  pp, 

— ,  2002:  Inverse  Modeling  of  the  Ocean  and  Atmosphere.  Cam¬ 
bridge  University  Press,  320  p p. 

- .  B,  S,  Chua.  and  L.  M.  Leslie,  1 9%:  Generalized  inversion  of 

a  global  numerical  weather  prediction  model.  Meteor.  Atmos, 
Rhys.,  60,  165-178. 

Chun,  B,  S.,  and  A.  F.  Bennett,  2001:  An  inverse  ocean  modeling 
system.  Ocean  ModelL  3,  137-165. 

Egbert,  G.  D..  A.  F.  Bennett,  and  M.  G.  G.  Foreman,  1994: 
TOPEX/PGSEIDON  tides  estimated  using  a  global  inverse 
method.  J.  Geophys.  Res..  99,  24  821-24  852. 

Even  sen,  G.,  1997:  Advanced  data  assimilation  for  strongly  non¬ 
linear  dynamics,  Mom  Wen,  Rev.,  125,  1342-1354. 

- .  and  N.  Fa rio,  1997:  solving  for  the  generalized  inverse  of  the 

Lorenz  model,  J,  Meteor.  Soc.  Japan,  75  (IB},  229  243. 

- ,  and  P,  J.  Van  Lecuwen,  2000:  An  ensemble  Kalman  smoother 

for  nonlinear  dynamics.  Mon.  Wtt l  Rev.,  128,  1852-1807. 

Gauthier.  P„  1992:  Chaos  and  quadric-dimensional  data  assimila¬ 
tion:  A  study  based  on  the  Lorenz  model.  Te/hts,  44A„  2  17. 

Jacobs,  G.  A,,  and  1  L  0,  Ngodock,  2003:  The  maintenance  of  con¬ 
servative  physical  laws  within  data  assimilation  systems.  Mon. 
Wen,  Rev.,  131,  2595-2607, 

Lorenz,  E,  N.*  1963:  Deterministic  nonperiodic  flow.  J ,  Atmos. 
ScL,  20,  130-141. 

Miller,  R.  N..  M.  Ghil.  and  F,  Gauthiez,  1994:  Advanced  data 
assimilation  in  strongly  nonlinear  dynamical  systems.  J.  ,4r- 
mas.  ScL  51,  1037-1056. 

— ■ ,  E,  F.  Carter,  and  S.  T.  Blue,  1999:  Data  assimilation  into 
nonlinear  stochastic  models.  Tell  us,  5  LA.  167-194, 

Muccino.  J.  C.  and  A.  F,  Bennett.  2CK12:  Generalized  inversion  of 
the  Korteweg-De  Vries  equation,  Dyn  Ah  nos  Oceans,  35 
(3).  227-263. 

Ngudock.  H.  E„  B.  S.  Chuu,  and  A,  F.  Bennett,  2tXK):  C  icnerali/ed 
inversion  of  a  reduced  gravity  primitive  equation  ocean 
model  and  tropical  atmosphere  ocean  dal  a.  Mon.  VVVo.  Rev., 
128,  1757-1777. 

Uholdi,  F.,  and  M.  K antacid,  2000;  Time-space  weak  -const  taint 
data  assimilation  for  nonlinear  models,  t'vflus,  52A,  4 1 2-42 L 

Xu.  L.,  and  R.  Daley.  2000:  Towards  a  true  4-dimensional  data 
assimilation  algorithm:  Application  of  a  cycling  representer 
algorithm  to  a  simple  transport  problem.  Telftts,  52A,  I  IN 
128. 

- ,  and - ,  2002:  Data  assimilation  with  a  Ixirot  Topically  un¬ 
stable  shallow  water  system  using  representer  algorithms. 
Tettus.  54 A,  125-137. 


